We propose a model to learn visually grounded word embeddings (vis-w2v) to capture visual notions of semantic relatedness. While word embeddings trained using text have been extremely successful, they cannot uncover notions of semantic relatedness implicit in our visual world. For instance, although "eats" and "stares at" seem unrelated in text, they share semantics visually. When people are eating something, they also tend to stare at the food. Grounding diverse relations like "eats" and "stares at" into vision remains challenging, despite recent progress in vision. We note that the visual grounding of words depends on semantics, and not the literal pixels. We thus use abstract scenes created from clipart to provide the visual grounding. We find that the embeddings we learn capture fine-grained, visually grounded notions of semantic relatedness. We show improvements over text-only word embeddings (word2vec) on three tasks: common-sense assertion classification, visual paraphrasing and text-based image retrieval. Our code and datasets are available online.